Adaptive Fleet Scaling System
Overview
The Adaptive Scaling System automatically monitors your AI agent fleets and proposes scaling adjustments (expansion or contraction) based on performance metrics. All scaling decisions require your explicit approval to maintain control over costs.
How It Works
Automatic Monitoring
The system continuously monitors your fleets for:
- **Success rate**: Percentage of successfully completed tasks
- **Latency**: Average time to complete tasks
- **Throughput**: Tasks completed per minute
- **Agent utilization**: How actively your agents are working
Scaling Triggers
**Expansion is proposed when:**
- Success rate drops below 85%
- Task latency exceeds 20 seconds
- Fleet is at capacity with pending work
**Contraction is proposed when:**
- Agent utilization drops below 30%
- Success rate is excellent (>95%)
- Cost optimization opportunities exist
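The threshold-based triggers above can be sketched as a small decision function. This is an illustration, not the system's actual logic: the `FleetMetrics` fields and the `propose_scaling` name are hypothetical, and the text does not say how the conditions combine. The sketch assumes any one expansion condition is enough, that contraction needs both low utilization and an excellent success rate, and that expansion takes priority. The "cost optimization opportunities" trigger is left out because no criterion is given for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FleetMetrics:
    """Hypothetical snapshot of the monitored metrics."""
    success_rate: float   # fraction of tasks completed successfully, 0.0 to 1.0
    avg_latency_s: float  # average time to complete a task, in seconds
    utilization: float    # fraction of time agents are busy, 0.0 to 1.0
    at_capacity: bool     # True when every agent is busy
    pending_tasks: int    # tasks waiting for an agent


def propose_scaling(m: FleetMetrics) -> Optional[str]:
    """Return "expansion", "contraction", or None for the given metrics."""
    # Assumption: any single expansion trigger is sufficient.
    if (
        m.success_rate < 0.85
        or m.avg_latency_s > 20
        or (m.at_capacity and m.pending_tasks > 0)
    ):
        return "expansion"
    # Assumption: contraction needs low utilization AND an excellent success rate.
    if m.utilization < 0.30 and m.success_rate > 0.95:
        return "contraction"
    return None
```

Checking expansion first means a fleet that is both slow and underused is never shrunk, which is the conservative reading of the two lists.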
The Approval Process
- **Detection**: System detects scaling need
- **Proposal**: Scaling proposal created with:
  - Current vs. proposed fleet size
  - Reason for proposal
  - Cost estimate
  - Current performance metrics
- **Notification**: You receive notification (in-app, email, or Slack)
- **Decision**: You approve or reject the proposal
- **Execution**: If approved, scaling is executed automatically
- **Verification**: You can verify the results
Viewing Proposals
Navigate to **Analytics > Scaling Proposals** to see:
- Pending proposals awaiting your decision
- Recent proposal history
- Current fleet status
Proposal Details
Each proposal shows:
- **Proposal type**: Expansion (red) or Contraction (green)
- **Fleet size change**: Current → Proposed
- **Reason**: Why this proposal was created
- **Cost estimate**: Projected additional cost (or savings)
- **Performance metrics**: Current success rate, latency, throughput
- **Expiration**: When the proposal expires (24h for expansion, 7 days for contraction)
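The expiration windows can be expressed as a lookup keyed by proposal type. This is a minimal sketch: the names `PROPOSAL_TTL`, `expires_at`, and `is_expired` are hypothetical, and it assumes a proposal counts as expired at the exact moment its window ends.

```python
from datetime import datetime, timedelta, timezone

# Expiration windows as stated in the proposal details.
PROPOSAL_TTL = {
    "expansion": timedelta(hours=24),
    "contraction": timedelta(days=7),
}


def expires_at(proposal_type: str, created_at: datetime) -> datetime:
    """Return the moment a proposal of the given type stops being actionable."""
    return created_at + PROPOSAL_TTL[proposal_type]


def is_expired(proposal_type: str, created_at: datetime, now: datetime) -> bool:
    """Assumption: the proposal is expired once `now` reaches the expiry time."""
    return now >= expires_at(proposal_type, created_at)


created = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
print(expires_at("expansion", created).isoformat())
print(expires_at("contraction", created).isoformat())
```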
Approving Proposals
- Click "View Details" on a proposal
- Review the cost breakdown:
  - Hourly cost
  - Daily cost
  - Weekly projection
  - Monthly projection
- Optionally add an approval note
- Click "Approve" to execute scaling
**After approval:**
- System recruits additional agents (expansion) or removes idle agents (contraction)
- Changes take effect within 1-2 minutes
- You'll receive confirmation when complete
Rejecting Proposals
- Click "View Details" on a proposal
- Enter a reason for rejection (required)
- Click "Reject"
**After rejection:**
- Similar proposals are suppressed for 4 hours (a cooldown that keeps the same proposal from reappearing immediately)
- Your feedback helps improve future recommendations
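The 4-hour suppression can be pictured as a simple time check. This is a sketch only: `is_suppressed` is a hypothetical name, and because the text does not define what makes two proposals "similar", the function assumes the caller passes the last rejection time for whatever counts as a similar proposal (for example, the same fleet and proposal type).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Suppression window applied after a rejection.
SUPPRESSION_WINDOW = timedelta(hours=4)


def is_suppressed(last_rejected_at: Optional[datetime], now: datetime) -> bool:
    """Return True while a new similar proposal should be held back."""
    if last_rejected_at is None:
        # No earlier rejection, so nothing to suppress.
        return False
    # Assumption: suppression ends exactly 4 hours after the rejection.
    return now - last_rejected_at < SUPPRESSION_WINDOW
```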
Plan Limits
Each subscription tier has fleet size limits:
| Plan | Max Fleet Size |
|---|---|
| Free | 2 agents |
| Solo | 5 agents |
| Team | 10 agents |
| Enterprise | 25 agents |
Exceeding Limits
If you need more agents than your plan allows:
- **Upgrade your plan** for permanent increase
- **Request overage approval** for temporary expansion
Enterprise plans get automatic 2x overage (up to 50 agents).
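The plan limits and the Enterprise overage translate into a small lookup. This is an illustrative sketch: the plan keys and the function names `effective_limit` and `fits_plan` are hypothetical. Approved temporary overages for other plans are not modeled, because the text does not state how large they can be.

```python
# Maximum fleet size per plan, taken from the table above.
PLAN_LIMITS = {"free": 2, "solo": 5, "team": 10, "enterprise": 25}

# Enterprise plans get an automatic 2x overage.
ENTERPRISE_OVERAGE_FACTOR = 2


def effective_limit(plan: str) -> int:
    """Return the largest fleet size available without a manual overage request."""
    base = PLAN_LIMITS[plan]
    if plan == "enterprise":
        return base * ENTERPRISE_OVERAGE_FACTOR
    return base


def fits_plan(plan: str, proposed_size: int) -> bool:
    """Return True when the proposed fleet size is within the effective limit."""
    return proposed_size <= effective_limit(plan)
```

With these values, a Team fleet can grow to 10 agents, and a proposal for 11 would need an upgrade or an overage request.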
Cost Estimates
Scaling proposals include cost estimates based on:
- **Agent-hour cost**: Historical average per agent
- **Token usage**: Expected token consumption
- **Duration**: How long expansion will be active
**Note:** Estimates are projections based on historical data. Actual costs may vary.
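As a rough illustration of how the three inputs could combine, the sketch below derives the hourly, daily, weekly, and monthly figures shown in a proposal. The formula is an assumption, not the system's actual pricing model: it treats token cost as separate from the agent-hour cost, uses a 30-day month, and every name (`CostBreakdown`, `project_cost`, `cost_for_duration`, and the parameters) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostBreakdown:
    hourly: float
    daily: float
    weekly: float
    monthly: float


def project_cost(
    agent_delta: int,
    agent_hour_cost: float,
    tokens_per_agent_hour: float,
    price_per_1k_tokens: float,
) -> CostBreakdown:
    """Project the cost change for adding (or removing) `agent_delta` agents.

    A negative `agent_delta` yields negative values, which read as savings.
    """
    # Assumption: token cost is billed on top of the agent-hour cost.
    token_cost_per_agent_hour = tokens_per_agent_hour / 1000 * price_per_1k_tokens
    hourly = agent_delta * (agent_hour_cost + token_cost_per_agent_hour)
    daily = hourly * 24
    # Assumption: a month is projected as 30 days.
    return CostBreakdown(hourly=hourly, daily=daily, weekly=daily * 7, monthly=daily * 30)


def cost_for_duration(breakdown: CostBreakdown, duration_hours: float) -> float:
    """Estimate the total cost if the change stays active for `duration_hours`."""
    return breakdown.hourly * duration_hours
```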
Budget Controls
The system blocks scaling that would:
- Exceed your monthly budget limit
- Exceed your plan's fleet size limit, unless an overage has been approved
You can view your remaining budget in **Settings > Billing**.
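These guards can be sketched as a pre-flight check that returns the reasons a scaling action would be blocked. The function name, its parameters, and the reason strings are hypothetical, and the sketch assumes spend and budget are tracked in the same currency unit for the current month.

```python
from typing import List


def blocking_reasons(
    proposed_size: int,
    fleet_limit: int,
    overage_approved: bool,
    spent_this_month: float,
    projected_additional_cost: float,
    monthly_budget: float,
) -> List[str]:
    """Return the reasons a scaling action would be blocked (empty if allowed)."""
    reasons = []
    # Guard 1: the projected spend must stay within the monthly budget.
    if spent_this_month + projected_additional_cost > monthly_budget:
        reasons.append("monthly budget exceeded")
    # Guard 2: going past the plan limit requires an approved overage.
    if proposed_size > fleet_limit and not overage_approved:
        reasons.append("fleet size limit exceeded without an approved overage")
    return reasons
```

Returning a list rather than a boolean lets a caller show every blocking condition at once.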
Best Practices
When to Approve Expansion
- **Approve** when: Team is overloaded, deadlines are at risk, or revenue impact is high
- **Reject** when: Degradation is temporary, or you can optimize existing agents first
When to Approve Contraction
- **Approve** when: Workload has permanently decreased, or to reduce costs
- **Reject** when: Low utilization is temporary, or busy period is expected soon
Cost Optimization Tips
- **Review proposals promptly**: Don't let urgent expansion requests expire
- **Monitor utilization**: Contraction proposals save money
- **Set appropriate budgets**: Prevents surprise overages
- **Use plan limits wisely**: Choose the right tier for your workload
Troubleshooting
Proposal Not Showing
- Check that the fleet is active (not completed or failed)
- Wait 5 minutes for the next monitoring cycle
- Ensure metrics are being recorded
Scaling Failed
- Check if proposal expired (24h for expansion, 7 days for contraction)
- Verify budget limits haven't changed
- Check fleet status (must be active)
Unexpected Proposals
- Review the performance metrics that triggered the proposal
- Check if there's a genuine issue or temporary fluctuation
- Reject with a reason if the proposal was unwarranted; this helps calibrate future proposals
API Access
Developers can interact with scaling via API:
- `GET /api/v1/fleet/scaling/proposals` - List proposals
- `POST /api/v1/fleet/scaling/proposals/{id}/approve` - Approve proposal
- `POST /api/v1/fleet/scaling/proposals/{id}/reject` - Reject proposal
See Scaling API Documentation for details.
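The sketch below shows one way to build requests for these three endpoints with Python's standard library. Only the paths and HTTP methods come from this page. The base URL, the bearer-token header, and the JSON body fields `note` and `reason` are assumptions made for illustration, so check the Scaling API Documentation for the real authentication scheme and payload shape.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://example.com"  # hypothetical host; replace with your own
API_TOKEN = "YOUR_API_TOKEN"      # hypothetical; bearer auth is an assumption


def build_request(method: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an HTTP request for the scaling API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {API_TOKEN}", "Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def list_proposals_request() -> urllib.request.Request:
    return build_request("GET", "/api/v1/fleet/scaling/proposals")


def approve_request(proposal_id: str, note: Optional[str] = None) -> urllib.request.Request:
    # Assumption: the optional approval note is sent as a "note" field.
    body = {"note": note} if note else {}
    return build_request("POST", f"/api/v1/fleet/scaling/proposals/{proposal_id}/approve", body)


def reject_request(proposal_id: str, reason: str) -> urllib.request.Request:
    # A rejection reason is required; the "reason" field name is an assumption.
    if not reason.strip():
        raise ValueError("a rejection reason is required")
    return build_request(
        "POST", f"/api/v1/fleet/scaling/proposals/{proposal_id}/reject", {"reason": reason}
    )


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (assumes a JSON body)."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect or log the call before anything reaches the network, for example `send(reject_request("abc123", "Spike is temporary"))`, where `abc123` is a placeholder ID.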
Related Features
- HITL System - Human-in-the-loop patterns
- Budget Management - Budget tracking and limits
- Fleet Analytics - Performance monitoring
---
**Last updated:** 2026-03-31
**Phase:** 242 - Adaptive Scaling System